Incident Report: Dead Gateway Alerts
Date: 2024-01-24
Time: 3:06 PM (GMT+3)
Duration: 5 days 5 hours 45 minutes
Description
Multiple Dead Gateway errors were detected through OpsGenie.
Root Cause
The root cause was that new gateways were deployed without removing the old gateways from the database. Aaron manually pruned the old gateways.
Impact
These gateway errors have led to alerts and potential disruptions in monitoring.
Timeline
- 15:06 (01-24) - Mertcan made changes and notified the team through Slack.
- 20:58 (01-28) - Aaron noticed the occurrence of dead gateways.
- 23:54 (01-28) - The SPY/USD and CHF/USD dead gateways were identified as related to the issue.
- 20:51 (01-29) - Aaron found that the failure to remove old gateways from the database had caused the problem.
Lessons Learned
Pruning old gateways manually may be required after the deployment of new signed-api gateways.
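The pruning step could be checked with a small script before any records are deleted. The following is a minimal sketch only: the record shape, the gateway IDs, and the function name are hypothetical and do not reflect the actual database schema or the signed-api tooling. It assumes the set of currently deployed gateway IDs is known and lists the stored gateways that are no longer part of it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GatewayRecord:
    gateway_id: str  # hypothetical identifier stored in the database
    feed: str        # feed served by the gateway, e.g. "CHF/USD"


def find_stale_gateways(stored, deployed_ids):
    """Return stored records whose gateway_id is not in the current deployment."""
    deployed = set(deployed_ids)
    return [g for g in stored if g.gateway_id not in deployed]


# Illustrative data only; real IDs would come from the database and the deployment.
stored = [
    GatewayRecord("gw-old-1", "SPY/USD"),
    GatewayRecord("gw-old-2", "CHF/USD"),
    GatewayRecord("gw-new-1", "SPY/USD"),
    GatewayRecord("gw-new-2", "CHF/USD"),
]
stale = find_stale_gateways(stored, ["gw-new-1", "gw-new-2"])

for g in stale:
    print(f"candidate for pruning: {g.gateway_id} ({g.feed})")
```

Reviewing the candidate list before deleting keeps the step manual, as in this incident, while reducing the chance that an old gateway is missed.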
Actions Taken
Alerts for gateway errors were acknowledged. Aaron investigated the dead gateway alerts and pruned the old gateways.
Related Images/Logs
Incident Reviewer(s)
Mertcan, Aaron, Andrew, Arda